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Abstract 

We consider Markov chain Monte Carlo methods for calculating condi¬ 
tional p values of statistical models for count data arising in Box-Behnken 
designs. The statistical model we consider is a discrete version of the first- 
order model in the response surface methodology. For our models, the Markov 
basis, a key notion to construct a connected Markov chain on a given sam¬ 
ple space, is characterized as generators of the toric ideals for the centrally 
symmetric configurations of root system D n . We show the structure of the 
Grobner bases for these cases. A numerical example for an imaginary data 
set is given. 


1 Introduction 

After the work by Diaconis and Sturmfels (pH]), a Markov basis, a key notion in the 
field of computational algebraic statistics, has attracted special attentions among 
researchers both in statistics and algebra. In this first work, they show the funda¬ 
mental relation between the generators of toric ideals and the Markov bases and 
establish a procedure for sampling from discrete conditional distributions by con¬ 
structing an irreducible Markov chain on a given sample space. By virtue of this 
relation, we can perform Markov chain Monte Carlo methods to estimate conditional 
p values for various statistical problems if we can obtain the generator of correspond¬ 
ing toric ideals. Readers can find various theoretical results on structure of Markov 

* Graduate School of Science and Engineering (Science Course), Kagoshima University. 

^Department of Pure and Applied Mathematics, Graduate School of Information Science and 
Technology, Osaka University 

•^Department of Mathematical Sciences, School of Science and Technology, Kwansei Gakuin 
University 


1 



bases such as minimality or invariance, and Markov bases of important statistical 
models such as hierarchical models of multi-dimensional contingency tables in [2j. 

In parallel, it is also valuable to connect known classes of toric ideals to statistical 
models. Such a motivation yields attractive research topics from algebraic fields to 
statistics. For example, |S] shows the relation between the generator of the toric 
ideals for the Segre-Veronese configuration and the special independence models in 
the testing problems of group-wise selections. This result is further generalized to 
a class of configurations called nested configurations in [4]. As another example, 
relations between regular two-level fractional factorial designs and cut ideals are 
shown in [3j. The arguments in this paper comes from the same motivation to these 
works. 

In this paper, we consider the statistical models corresponding to the algebraic 
object known as a centrally symmetric configuration of root system D n . The notion 
of centrally symmetric configurations ([T5jj) is one of the new attractive topics in 
algebra since it yields many “toric rings” that have important algebraic properties 
(normal and Gorenstein). See [15] . On the other hand, Grobner bases of the toric 
ideal arising from the configuration of D n is studied in [14] . In addition, convex 
polytopes arising from the centrally symmetric configuration of D n are studied in 
[5]. In this paper, we show that the centrally symmetric configuration corresponds 
to the first-order models for the symmetric designs of experiments for multi-level 
factors. As typical examples of such designs, we consider Box-Behnken designs in 
this paper. Markov chain Monte Carlo procedure in the framework of design of 
experiments is introduced in [fj and [6] . In these works, regular two-level and three- 
level designs are considered. However, non-regular designs are difficult to treat in 
general. In this paper, we present a new method for analyzing non-regular designs. 

The construction of this paper is as follows. In Section 2, we review the Markov 
chain Monte Carlo methods for design of experiments. In Section 3, we give a 
definition of the centrally symmetric configuration and present statistical models. 
We also introduce the Box-Behnken designs and show that the model matrix of 
the first-order models for the Box-Behnken designs corresponds to the centrally 
symmetric configuration of root system D n . In Section 4, we give the Grobner bases 
of the centrally symmetric configuration of root system D n . In Section 5, we give 
numerical example for an imaginary data set. Finally, we give some discussion in 
Section 6. 


2 Markov chain Monte Carlo methods for design 
of experiments 

In this section, we introduce Markov chain Monte Carlo methods for testing the 
fitting of the log-linear models for fractional factorial designs with count observa¬ 
tions. We consider the designs with m controllable factors. For j = 1 ,,m, write 
Aj 6 Q as the level of the j'-th factor. For example, if j'-th factor has three levels, 
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it is common to write Aj = {—1, 0,1}. The full factorial design P C Q m is given as 
P = A 1 x • • • x A m and the fractional factorial design F is a subset of P. Suppose 
there are k runs (i.e., points) in F. For convenience, we order the points of F ap¬ 
propriately and consider a k x m design matrix D = (d t j), where is the level of 
j-th factor in z-th run for i — 1,..., k, j — 1,..., m. 

We write the observations as y = (z/i,... ,yk)', where ' denotes the transpose. 
In this paper, we consider the Poisson sampling scheme, i.e., we suppose that the 
observations are counts of some events and the observation y is nonnegative integer 
vector. We also suppose that only one observation is obtained for each run. This is 
a natural setting because the set of the totals for each run is the sufficient statistics 
for the parameter in the Poisson sampling scheme. Therefore the observation y are 
realizations from k mutually independent Poisson random variables Tj..... W with 
the mean parameter A* = E(Yj), i — 1,..., k. 

We consider the log-linear model 

log A j = A) + P&n H-h finXin, i = 1, • • •, k (1) 


for the parameter A i,i — 1,..., k, where x tJ is a j-th covariate for the z-th run and 
n + 1 is the dimension of the parameter ft = (/3 0 , (3 1 ,..., /3 n )'. If we write x i0 = 1 for 
i — 1,..., k, the log-linear model (JTJ) is written as 

(log Ai,.. • ,log A fc )' = M/3, 


where M = (xij)i=i t ... t k;j=o,...,n- We call a k x (n + 1) matrix M as a model matrix of 
the log-linear model (IT]) . 

To judge the fitting of the log-linear model (JT[), we can consider various goodness- 
of-fit tests. In the goodness-of-fit tests, the model dU) is treated as the null model, 
whereas the saturated model is treated as the alternative model. Under the null 
model the sufficient statistics for the (nuisance) parameter /3 is given by M'y 
from the factorization 


n 



IIT7T h*P p'V'y-X> 




Vi 


i= 1 


Therefore the conditional distribution of y for the given sufficient statistics is written 
as 

k 

/(y I M'y = M'y") = C(M'y-)-' Tl V ( 2 ) 

A Vi 

where y° is the observation vector and C(M'y°) is the normalizing constant deter¬ 
mined from the sufficient statistics M'y° as 


C(M'y°) 


E 

y€T(M'y°) 



(3) 
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and 


HM' y°) = {ye z| 0 | M'y = M' y °}. (4) 

In this paper, we consider goodness-of-fit tests based on the conditional distri¬ 
bution ([2]). There are several choices of the test statistics T(y). Frequently used 
choices are the likelihood ratio statistics 


k 

T{y ) = 2 2 /i log ^ 

i =1 Ai 


(5) 


or the Pearson y 2 statistics 


r(y) = £ 


(s/i - Aj ) 2 

A,: 


where A* is the maximum likelihood estimate for A* under the null model (i.e., fitted 
value). A simple way of judging the significance for the observed value T(y°) is 
the asymptotic p value based on the asymptotic distribution xt-n-i °f the test 
statistics. However, the fitting of the asymptotic approximation may be sometimes 
poor. Therefore we consider conditional exact p values in this paper. Based on the 
conditional distribution (j2j) , the exact conditional p value is written as 

P= £ /(y | M'y = M'y”)l(T(y) > T(y")), (6) 

y&T(M'y°) 


where 


l(T(y) > T(y°)) 


1, if T(y) > T(y°), 
0, otherwise 


(7) 


is the test function of T(y). Of course, if we can calculate the exact p value of 
m and ©.it is best. However, the cardinality of the set T{M' y°) becomes huge 
for moderate sizes of data and the calculation of the normalizing constant C(M'y°) 
of (|2I) is usually computationally infeasible. Instead, we consider a Markov chain 
Monte Carlo method to evaluate the conditional p values. It should be noted that 
we need not calculate the normalizing constant (j3j) to evaluate the p values by the 
Markov chain Monte Carlo methods. This point is one of the important advantages 
of the Markov chain Monte Carlo methods. 

To perform the Markov chain Monte Carlo procedure, we have to construct an 
irreducible Markov chain over the conditional sample space (J3J) with the stationary 
distribution ([2]). If such a chain is constructed, we can sample from the chain as 
yW,...., yT) after discarding some initial burn-in steps, and estimate the p values 
as 

p=4£ 1 (T(y |,) >T(y“)). 

t= 1 
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Such a chain can be constructed easily by Markov bases. Once a Markov basis is 
obtained, we can construct a connected, aperiodic and reversible Markov chain over 
the conditional sample space (pi]), which can be modified so as to have the stationary 
distribution (]2]) by the Metropolis-Hastings procedure. See HQ or mu for details. 

The Markov basis is characterized algebraically as follows. Write the variables 
xi,... ,Xk and consider the polynomial ring K[x±,, Xk ] over a held K. Consider 
the integer kernel of the transpose of the model matrix iff, Ker %M'. For each 
b = bk)' € Ker z M', define a binomial in K[x i,..., xjf\ as 

/b=n~ n ■ 

bi> 0 bi< 0 

Then the binomial ideal in K[x i,..., X&], 

Im> = ({/b I b e Ker z M'}) 

is called a toric ideal of the configuration M'. Then for a generating set of Im 1 , 
{/ b (i>,..., / b (s)}, the set of integer vectors {b ( b ; ... 5 b^} constitutes a Markov basis. 
See [IQ] for details. 

3 Statistical models of the centrally symmetric 
configurations and Box-Behnken designs 

As we have seen in Section 2, if we can obtain a generator of Im 1 , a toric ideal of 
the configuration M', we can judge the fitting of the statistical model expressed by 
the model matrix M by the conditional p values estimated by the Markov chain 
Monte Carlo methods. For small sizes of problems, we can rely on various softwares 
such as 4ti2 (|Tj) to compute generators of the toric ideals. However, for problems 
of large sizes, it is usually very difficult to compute Markov bases or Grobner bases 
for given ideals. On the other hand, if we have theoretical results on the structure 
of the corresponding ideals, it is very easy to perform the Markov chain Monte 
Carlo procedure for such configurations. The centrally symmetric configuration is 
an example of such cases. 

The centrally symmetric configuration is given in ([13]) as follows. Let A G Z nxs 
be an integer matrix for which no column vector is a zero vector. Then the centrally 
symmetric configuration of A is the (n + 1) x (2s + 1) integer matrix 



( ° 



A ± = 

0 

A 

-A 


l 1 

1 ••• 1 

1 ... i 


It is known that the “toric ring” A'[A 1 * 1 ] of A ± is normal and Gorenstein if there 
exists a squarefree initial ideal with respect to a reverse lexicographic order where 
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the smallest variable corresponds to the first column of A ± . See, e.g., [121 Lemma 
!■!]■ 

As natural statistical models and designs where (the transpose of) the model ma¬ 
trix is centrally symmetric configurations, we consider the simple first-order models 
as follows. Suppose F C A\ x ■ ■ ■ A m e Q m is a symmetric design including the 
origin , i.e., a design satisfying 


(0,..., 0) e F and d e F => —d e F. 


Write D = (d t j) e Z fcxm its design matrix, where dij is the level of j-th factor in 
2 -th run for i — 1,..., k, j = 1,... ,m. Then we see that the transpose of the model 
matrix 


/ 1 \ 


M = 


D 


(9) 


V 1 


is centrally symmetric. Corresponding log-linear model (JTJ) is written as 


log At /?o T lAdi\ T ■ ■ ■ T fdmdimi l 1, ..., k. (19) 

We call the model (TTOl) as a first-order model in this paper. The interpretation of 
the first-order model (fill is as follows. Suppose there are adequate meanings both 
in the order of the factors and the interval of the factors for the design F. Then 
the model (fTOl) means that the logarithm of the influence to the response variable 
is proportional to the difference of the levels for each factor. In other words, the 
parameter ffi represents the main effect of the j'-tli factor for j = 1 ,,m. The 
first order model (fTUl) is a discrete version of the first-order model arising in the 
context of the response surface methodology. See Section 9 of SSI, for example. 
A typical example of the symmetric designs is also arising in the context of the 
response surface methodology as Box-Behnken designs. 

The Box-Behnken design is a family of three-level fractional factorial designs 
introduced by |9]. This design is constructed by combining two-level factorial de¬ 
signs with balanced (or partially balanced) incomplete block designs in a particular 
manner. To illustrate the concept of the Box-Behnken designs, consider the case of 
three factors (i.e., m — 3). A balanced incomplete block design with three factors 
and three blocks is given as follows. 


Block 

Factor 
12 3 

1 

o o 

2 

o o 

3 

o o 


The Box-Behnken design is constructed by replacing the two circles (o) in each 
block by the two columns of the two-level 2 2 design and add a column of zeros 
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where a circle does not appear, and adding a run at the origin. In this example, the 
Box-Behnken design is constructed as follows. 


Factor 

1 2 

3 

-1 

-1 

0 

-1 

1 

0 

1 

-1 

0 

1 

1 

0 

-1 

0 

-1 

-1 

0 

1 

1 

0 

-1 

1 

0 

1 

0 

-1 

-1 

0 

-1 

1 

0 

1 

-1 

0 

1 

1 

0 

0 

0 


Similarly, by combining various incomplete block designs with two-level full (or frac¬ 
tional) factorial designs, various three-level fractional factorial designs are obtained. 
In this paper, we only consider the Box-Behnken designs constructed from the two- 
level 2 2 design and the balanced incomplete block designs with the block size 2, the 
number of factors (or treatments) m, the number of blocks m{m — l)/2 and the 
number of replicates for each factor m — 1, with a single run at the origin. Note that 
it is common to consider the designs with several runs at the origins in this field. 
See [9] or Chapter 9 of HE! for details. However, we only consider the designs with 
single observations even in the origin. Therefore the Box-Behnken design considered 
in this paper has 2 m{m — 1) + 1 runs for m factor case. 

For these Box-Behnken designs, we consider the first-order model (TTOj) with the 
model matrix (|9]). Note that n = m in our cases. Then we see that the transpose of 
the model matrix, M', has the centrally symmetric structure (jH]) with s = m{m— 1). 
For example, the transpose of the model matrix of the first-order model for the three 


factors case 

is given by 












( 0 

-l -l 

1 

1 

-1 

-1 

1 

1 

0 

0 

0 

0 \ 


0 

-l i 

-1 

1 

0 

0 

0 

0 

-1 

-1 

1 

1 


0 

0 0 

0 

0 

-1 

1 

-1 

1 

-1 

1 

-1 

1 


V 1 

1 1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

!/ 


which is the centrally symmetric configuration of 

/ —1 —1 —1 — 1 0 0 \ 
-1 1 0 0 - 1-1 

\ 0 0-1 1-1 1 / 
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Following the arguments of Section 2, we can judge the fitting of the first-order 
model for the Box-Behnken designs by the Markov chain Monte Carlo methods, if 
we obtain the generators of the toric ideal of the configuration of this type. 


4 Grobner bases of centrally symmetric configu¬ 
rations of root system D n 

Now we show the structure of the Grobner bases of the centrally symmetric config¬ 
urations for the Box-Behnken designs. Because the Grobner basis is a generator of 
the ideals, we can use the Grobner basis as a Markov basis. As an important fact, 
the transpose of the model matrix for the first-order models for the Box-Behnken 
designs is characterized as the configuration of the root system D n . 

Let ei,..., e n stand for the canonical unit coordinate vectors of R n and C M n 
the finite set which consists of the origin 0 of M n together with 


6j T e,-, e* — e,-, —ej -|- e^, —ej — e^, I vl / <' /' n. 

Let K[ t, t _1 , s] = K[ti ,..., t n , t j” 1 , • • •, t” 1 , s] denote the Laurent polynomial ring 
in 2n + 1 variables over a held Ii. The toric ring of is the subring A'[D±] of 
A'[t, t” 1 , s] which is generated by s together with t i tjS 1 t^HjS, t^t^s, where 

1 < i < j < n. Let A'[x, z] be the polynomial ring over K in the variables z together 
with xfj, where 1 < i < j < n and p,q G {+, —We then define the surjective 
ring homomorphism n : A"[x, z] —> K\D±\ by setting n(z) = s and vr(xfj) = t p t q s, 
where +' = 1 and = —1. For example 7r( x ^" 5 + ) = t^t^s. The toric ideal of is 
the kernel A.* of n. 

Fix an ordering < of the variables of AT[x] with the property that x p J < x r kl if 
either (i) i < k or (ii) i = k and j > i. Let <i ex denote the lexicographic order on 
K [x] induced by the ordering <. We introduce the monomial order -< on A[x, z] 
defined as follows: One has n^=i z<x A nt=i where «,/5 6 Z> 0 , if 

• a + a < b + (3, or 


• a + a = b + (3 and a > (3, or 

• a = b, a = 13 and n?=i x ZJl <lex IlLi X ZZ- 
Let Q denote the set of binomials 

(i) x pq x r k \ - x p k xf e , 1 <i<j<k<£<n; 

(ii) *£*£ ~ <l x ]Z 1 <i<j<k<£<n; 

(hi) x+ p x~ q - x pq k z, \{i,j, k}\ = 30 


1 For j < i, the notation xri is identified with the variable x q ?. 



(iv) - z 2 , 1 < i < j < n\ 

(v) xlj~x^ + - z 2 , 1 < i < j < n; 

(vi) xffxfr - x+ p x“ p , 1 <ifij <n~ 

(vii) x p +x p ~ - 1 <j<n; 

(viii) x$?xf?xf k - x^x^z, l<i,j,k< n, \{i, j, k}\ = 3 
belonging to 

Theorem 4.1. The set Q of binomials is a Grobner basis o// D ± with respect to -< . 


Proof. In general, if / = u — v is a binomial, then u is called the first monomial of 
/ and v is called the second monomial of /. The initial monomial of each of the 
binomials (i) - (viii) with respect to -< is its first monomial. Let in^(^) denote the 
set of initial monomials of binomials belonging to Q. It follows from [14] (0.1)] that, 
in order to show that Q is a Grobner basis of with respect to -<, what we must 
prove is the following: (J|k) If u and v are monomials belonging to A'[x, z] with u v 
such that u 0 in^(C7) and v fL in^(fy), then tt(u) n(v). 

Let u and v be monomials belonging to AT[x, z]. Write 


u = x ? 1 ? 1 • • • xf *V, 

i>aja 7 


v = x 


risi 

kill 


x k b e b z 


with 

k < ■ ■ ■ < ia, h <■■■ <k b . 

Let tt(u) = 7 t(v). Then a + a = b + ( 3 . Suppose that u ^ in_ < (t/) and v ^ in^(^). 
Furthermore, suppose that u and v are relatively prime. Especially either a = 0 or 
j3 = 0. Let, say, a = 0. In other words, 


„. _ xp iqi 
u ~ x iih 


■ x: 


.PaQa 
iaja ’ 


V = X 


r\s\ 

kih 


r r b s b 

X k b h Z ’ 


where (3 = a — b. 

Let i a t < i a rr, where 1 < a' < a" < a. Then, by using (i), one has i a » < j a > and, 
by using (ii), one has j a > < j a " . Hence i a > < i a » < j a > < j a » . It then follows that 

k < ■ ■ ■ < k < ji < ■ ■ ■ < j a , h < ■ ■ ■ < k b < k < ■ ■ ■ < t b . (11) 

We claim that none of the followings arises: 

(tl) V = ia" and Pa' Pa"] 

(0 i a = ji and p a qp 

( t| ) ja' = ja" and q a > q a „. 
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(Case (ff)) Let v = V' and p a > 7 ^ p a ", where 1 < a' < a" < a. Then, by using (iii), 
one has j 0 > = j a " ■ Then, by using (iv) and (v), one has q a > = q a "- Moreover, by using 
(vi), one has i a > = i a » = 1. Thus x Pa j!j a j x Pa j' qa 'j = x^ q j'xjj q j' divides u. Then, by 

using (viii), each variable x Pa j3 a * ( 7 ^ x ij a ,'i 7 ^ x \j a !) which divides u satishes either 
ia* = 1 or \{ja',ia*,ja*}\ = 2 with i a * > 1. Then, by using (iii), if i a * = 1, then 
] a ’ = ja*- Thus q a i 7 ^ q a *. However, by using (iv) and (v), a contradiction arises. 
Hence either j a > = i a * > 1 or j a > = j a * >1. In other words, u is divided by either 

x Va' x ^l' xP j a a 7a* or x Vj' x ^t'' Then > b F usin S ( iii )> = ia*, then q a , = p a ... 
Again, by using (iii), if j„> = j a *, then q a i = q a *. Hence either x t 


= x £j** or 


X 


l a* Ja* 


= x, 


1-a*J, 


As a result, either tj or t- a divides n(u). Let, say, tj divides 
n(u). Since tt(u) = vr(u), it follows that (3 = a — b = 0 and that tj t divides 7 r(w). 

Let, say, either tj or tj 1 , where j 7^ j a i , divide 7 r(u) = tt(v). Then either x xx - or 
xf~ must divide both u and v, which contradicts the fact that u and v are relatively 
prime. Hence 7 t(u) = 1t{v) = tj / s a . Thus a quadratic monomial , where 

J a J a r J J a r J 

jo 7 ^ ja', divides u. Then, since j 0 / > 1, by using (vi), one has jo = 1. Hence 
xtj~,x7j~, divides both u and v, a contradiction. 

x Ja' L Ja' 

(Case (b)) Let i a = j\ and p a 7 ^ q±. Then, by using (iii), one has u G in_,(£/). 


(Case (t])) Let j 0 / = j a » and q a i 7 ^ g a », where 1 < a' < a" < a. Then, by using 
(iii), one has i a > = i a ». Furthermore, by using (iv) and (v), one has p a > = p a ". If 
i a t > 1, then u is divided by x p , a "f l " = x Pa 'j a 'x Pa '~ qa '. Thus, by using (vi), one 
has u G in_<((7). Hence i a i = i a " = 1. Thus, by using (vii), one has j a > = j a " = n. 

Let i a * > 1 for some 1 < a* < n. Then a' < a* and 1 = i a > < ia* < n = ja' < j a * ■ 
Hence j a > = j a * = n. However, since i a * > 1, it follows that q a i = q a *. Thus 
x Pa "‘j a "x Pa *!j a * = x P n~ q °'x Pa * qa '. Since i a * < n, by using (iii), one has u G in_<((/). As 
a result, i a * = 1 for all 1 < a* < a. 

Now, since (j}) cannot occur, one has p\ = p a * for all 1 < a* < a. Thus 7 r(u) is 
divided by either tj or tjj a . Let, say, tj divide 7 r(w). Since 7 r(u) = 7 r(u), it follows 
that (3 = a — b = 0 and tj divides n(v). Let, say, either tj or tj 1 , where j 7 ^ 1, divide 
7 t(u) = 7 t(v). Then either x^ + or xjj~ must divide both u and v, which contradicts 
the fact that u and v are relatively prime. Hence n{u) = 7r(v) = tjs a . Thus a 
quadratic monomial x^x^jj, where jo 7 ^ 1, divides u. Then, by using (vii), one has 
jo = n. Thus x x +xjj- divides both u and v, a contradiction. 

Finally, since none of (j}), (b) and ( tj ) arises, it follows that no cancellation 
occurs in the expression of the Laurent monomial 


<«) = « •••«•• ''I- 

Since 7 r(u) = 7 r(v), one has f3 = a — b = 0. Furthermore, since no cancellation occurs 
in the expression of the Laurent monomial 


tt( v ) — f 1 ... t ra t Sl ... t Sa 
7 l \ v )— l k 1 l k a l h h 0 i 
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it follows from (HIT) that 


H = fy, = £ ( , Pz = r e , q ( = s 5 , 1 < £ < a. 


Recall that u and v are relatively prime. Hence a = 0 and u — v — 1. Consequently 


the required condition (J|k) is satished. 

Example 4.2. Consider the case of n — 3. From Theorem 14.11 


yy»H I - yy» "4" 

x 12 x 13 

— X 23~ Z ) 

yy»H I - yy» 

x 12 x 13 

- x 23 _ 

/y* 1 rr* 

x 12 x 13 

- x ^3 + 

X 12 X 13 

nr* rr*~^ I - 

x 12 x 13 

— x 2Z~ z i 

nr* ~ I - nr* 1 

x 12 x 13 

- 4s" 

nr* /y* - 1 

x 12 x 13 

- x ^3 + 

X 12 X 13 

,y»H nn 

x 12 x 23 

— xfr^Z, 

zy»H I - nr* 

x 12 x 23 

- x 13 _ 

nr* ~ I - nr* ~ 

x 12 x 23 

- x r 3 + 

X 12 +X 23 

/y* 1 /y»H I - 

x 12 x 23 

— X 33 Z, 

rr* 1 nr* 1 

x 12 x 23 

- X+“ Z, 

nr* /y»H I - 

x 12 x 23 

- Xk 3 + Z, 

X 12 X 23 

zy»H I - nr* 1 

x 13 x 23 

— X 12 + ^> 

rr* ^ I - nr* 

x 13 x 23 

- x^ 2 ~Z, 

nr* ~ I - nr* 1 

x 13 x 23 

- xk 2 + z, 

X 13^ X 23 

yy* - ! - /y»~1 I - 

x 13 x 23 

— X 12 + Z i 

yy* - ! - nr* "4~ 

x 13 x 23 

~ x 12~ Z , 

nr* /y* - 1 

x 13 x 23 

~ X 12 z , 

x 13 X 23 + 


x 23 A 

Xn-T z, 


x 13 z, 


++„— 


IV I , I 


12 *°12 




2 +4-- 

y rr* 1 1 rr* 

z i x 13 x 13 


V XU X 


12 *°12 


13 J/ 23 
++„.-+ - 


2 4 - 4 -- 

/y» 1 1 rr* 

* j x 23 x 23 
c 23 X 13 + 


( in \ rr* ' I rr*~ • _ <y» I I /y> I - nr* 

V1 J x 23 x 23 x 12 x 12 ) x 


1 I” -y. 3~ 
x 23 x 23 


Vll X 


++™H— 
12 X 12 


yy.”^ I” yy» 

x 13 x 13 j 

^y.H I - -y. I 

x 13 x 13 ) 


23 X 23 X 
+-- 


x 23 X 23 
X 12 +X 12 


12 X 12 ) 
X 13 X 13 i 


x 13 X 13 


□ 


is the Grobner basis Q of with respect to -<. 

We have the following since the binomial x^^x^x^ — x^x^z (viii) where 1 < 
i,j, k < n and \{i,j, k}\ = 3 satisfies 


T +P T -P T +r _ JH-J" 1 . _ ~+rV +P ~P _ P+ P~\ , P+( P~ +r _ pr \ 

X li X li X jk X^ X ik 4 — X- k X {i X^ X^ ) TJ-jj \X^ X- k X ik 6), 

-\-p —p —r P — pr — v( ~\~P —P P~\~ P —\ i P —/ P~\~ —r pr \ 

rr* 1 rr* 1 rr* _ rr** rr * 1 y - / rr* 1 rr* 1 rr * 1 rr* 1 i _l_ rr** I Op 1 r f t _ rr * 1 y 1 

x l« x li Xj k x^ x ik ~ Aj k j ~r x^j \x^ u,j k x ik ^). 


Corollary 4.3. The toric ideal is generated by binomials (i) - (vii) in Q in 
Theorem |4.R In particular, -^d x is generated by quadratic binomials. 


5 Numerical example 

In this section, we perform our Markov chain procedure to an imaginary data set. 
Our data set is constructed from actual experimental data as follows. In pRj, the 
Box-Behnken design is used to apply the response surface method. The purpose of 
this experiment is to determine the optimal processing condition of a pulsed UV- 
light system to inactivate the fungal spores of Aspergillus niger in corn meal. The 
three factors are A: Treatment time (20, 60, 100 second), B: Distance from the UV 
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strobe (3, 8, 13 cm), and C: Voltage input (2000, 2900, 3800 V). The response is the 
reduction of the Aspergillus niger in the log 10 scale. In [13] , the first-order and the 
second-order polynomial models for the response are considered. See [13] for detail 
description of the data analysis. Because our method is for discrete data, we use the 
rounded values of (10 times of) the responses in this experimental data and treat 
them as realizations of discrete variables. Then we have an imaginary data set in 
Table [0 Because the responses in the original data in JT3] are continuous values, 


Table 1: Box-Behnken design matrix for the three factors and the response (The 
responses are rounded values of Table 1 of |T3] . Fitted values are calculated under 
the model (TTOh ) 


Time (s) 

Distance (cm) 

Voltage (V) 

Response 

Fitted values 

20 

3 

2900 

4 

4.04 

20 

13 

2900 

3 

3.59 

100 

3 

2900 

33 

31.58 

100 

13 

2900 

30 

28.02 

20 

8 

2000 

2 

2.12 

20 

8 

3800 

5 

6.84 

100 

8 

2000 

14 

16.57 

100 

8 

3800 

50 

53.42 

60 

3 

2000 

7 

6.29 

60 

3 

3800 

21 

20.29 

60 

13 

2000 

5 

5.58 

60 

13 

3800 

20 

18.01 

60 

8 

2900 

13 

10.64 


we cannot emphasize our computational results from the applied statistical view. 
The purpose of this numerical experiment is only to check that our method works 
for some discrete data. However, it can also be natural to consider the fitting of 
the log-linear model ([I]) to the response because the original response is reported in 
log 10 scale. 

For the response data in Table |TJ we consider the fitting of the first-order model 
ca based on the likelihood ratio statistics (J5]) . The fitted values under the null 
model is calculated in the last column of Table [I] The likelihood ratio is 2.36 with 
9 degree of freedom. Therefore the asymptotic p value is 0.98 from the asymptotic 
Xg distribution. To evaluate the fitting of the first-order model (HU , we perform the 
Markov chain Monte Carlo method. We use the Grobner basis given in Example 14.21 
as a Markov basis. After 50000 burn-in steps from the observed data as the initial 
state, we derive 100000 samples by Metropolis-Hasting algorithm. Among these 
samples, 98834 samples have the larger likelihood ratio values than the observed 
2.36. Therefore the conditional p value is estimated as 0.99, which suggests the 
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good fitting of the first-order model (fll)|) . Figured] is the histogram of the sample 
likelihood ratio statistics with the asymptotic xl distribution. 


o 



0 5 10 15 20 25 30 35 


Figure 1: Asymptotic and Monte Carlo estimated distribution of the likelihood ratio 
statistics 


6 Discussion 

In this paper, we present a new method for analyzing non-regular fractional factorial 
designs. The motivation of this paper is a new finding on the structure of the 
Grobner bases of the centrally symmetric configurations of root system D n . As 
we have seen in the paper, we can relate the theoretical results in the algebraic 
field to the statistical problems for the Box-Behnken designs. Our model is simple 
and fundamental. In fact, we usually consider more complicated models such as 
second-order model for the analysis of the Box-Behnken designs. However, the 
structure of the Markov bases or the Grobner bases for the second-order model is very 
complicated. Though the Markov chain Monte Carlo methods can be considered for 
general non-regular designs, the structure of the Markov bases is only revealed for 
simple models such as the hierarchical models for the regular designs at present. 
Therefore we think our contribution on the new results of the non-regular designs is 
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important. Besides, compared to the continuous data analysis on the assumption of 
the normality, there are very few experiments are reported treating the discrete data 
arising in the fractional factorial designs. We think our Markov chain Monte Carlo 
procedure is very simple and can be used easily, and can be one of the powerful 
choices in the analysis of the discrete data. 
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